Semantic Annotations for Biology: a Corpus Development Initiative at the Jena University Language & Information Engineering (JULIE) Lab
Abstract
We provide an overview of corpus-building efforts at the Jena University Language & Information Engineering (JULIE) Lab that focus on life science documents. Special emphasis is placed on semantic annotations covering a large number of biomedical named entities (almost 100 entity types), semantic relations, and discourse phenomena, reference relations in particular.
Related resources
The CALBC Silver Standard Corpus - Harmonizing multiple semantic annotations in a large biomedical corpus
The CALBC initiative aims to provide a large-scale biomedical text corpus that contains semantic annotations for tagged named entities of different kinds. The generation of this corpus requires that the annotations from different automatic annotation systems are harmonized. In the first phase, the annotation systems from 5 participants (EMBL-EBI, EMC Rotterdam, NLM, JULIE Lab Jena, and Linguama...
BioTop and ChemTop - Top-Domain Ontologies for Biology and Chemistry
Holger Stenzhorn, Stefan Schulz (University Medical Center Freiburg, Institute for Medical Biometry and Medical Informatics, Stefan-Meier-Straße 26, 79104 Freiburg, Germany); Elena Beißwanger, Udo Hahn (Jena University Language and Information Engineering (JULIE) Lab, Fürstengraben 30, 07743 Jena, Germany) udo.hahn@uni-j...
The GeneReg Corpus for Gene Expression Regulation Events - An Overview of the Corpus and its In-Domain and Out-of-Domain Interoperability
Despite the large variety of corpora in the biomedical domain, their annotations differ in many respects, e.g., the coverage of different, highly specialized knowledge domains, varying degrees of granularity of the targeted relations, the specificity of linguistic grounding of relations and named entities referred to in the documents, etc. We here introduce GENEREG (Gene Regulation Corpus), the ...
The JULIE LAB MANTRA System for the CLEF-ER 2013 Challenge
We here describe the set-up of the system from the Jena University Language & Information Engineering (JULIE) Lab which participated in the CLEF-ER 2013 Challenge. The task of this challenge was to identify hitherto unknown translation equivalents for biomedical terms from several parallel text corpora. The languages covered are English, German, French, Spanish, and Dutch. Our translation...
Towards Enhanced Interoperability for Large HLT Systems: UIMA for NLP
We introduce JCORE, a full-fledged UIMA-compliant component repository for complex text analytics developed at the Jena University Language & Information Engineering (JULIE) Lab. JCORE is based on a comprehensive type system and a variety of document readers, analysis engines, and CAS consumers. We survey these components and then turn to a discussion of lessons we learnt, with particular empha...